Optimized Distributed Hyperparameter Search and Simulation for Lung Texture Classification in CT Using Hadoop
Authors
Abstract
Many medical image analysis tasks require complex learning strategies to reach a quality of image-based decision support that is sufficient for clinical practice. The analysis of medical texture in tomographic images, for example of lung tissue, is no exception. Via a learning framework, very good classification accuracy can be obtained, but several parameters need to be optimized. This article describes a practical framework for efficient distributed parameter optimization. The proposed solutions are applicable to many research groups with heterogeneous computing infrastructures and to various machine learning algorithms. These infrastructures can easily be connected via distributed computation frameworks. We use the Hadoop framework to run and distribute both grid and random search strategies for hyperparameter optimization and cross-validations on a cluster of 21 nodes composed of desktop computers and servers. We show that significant speedups of up to 364× compared to a serial execution can be achieved using our in-house Hadoop cluster by distributing the computation and automatically pruning the search space while still identifying the best-performing parameter combinations. To the best of our knowledge, this is the first article presenting practical results in detail for complex data analysis tasks on such a heterogeneous infrastructure together with a linked simulation framework that allows for computing resource planning. The results are directly applicable in many scenarios and allow implementing an efficient and effective strategy for medical (image) data analysis and related learning approaches.
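As a rough illustration of the grid and random search strategies mentioned in the abstract, the following Python sketch enumerates parameter combinations and treats each one as an independent task, which is what makes this kind of search easy to distribute. It is not the authors' Hadoop implementation: the parameter names (`log2_C`, `log2_gamma`), the value ranges, and the `toy_score` function are all hypothetical placeholders, and the search-space pruning described in the abstract is not modeled here.

```python
import itertools
import random


def grid_candidates(space):
    """Yield every combination of the listed values (grid search)."""
    names = sorted(space)
    for values in itertools.product(*(space[n] for n in names)):
        yield dict(zip(names, values))


def random_candidates(space, n, seed=0):
    """Yield n combinations drawn uniformly from the listed values (random search)."""
    rng = random.Random(seed)
    names = sorted(space)
    for _ in range(n):
        yield {name: rng.choice(space[name]) for name in names}


def toy_score(params):
    """Hypothetical stand-in for a cross-validated accuracy; not a real classifier."""
    return -(params["log2_C"] - 3) ** 2 - (params["log2_gamma"] + 5) ** 2


def evaluate(params):
    """Map-style step: one independent task per parameter combination."""
    return toy_score(params), params


def best(results):
    """Reduce-style step: keep the highest-scoring combination."""
    return max(results, key=lambda r: r[0])


# Hypothetical search space (assumed for illustration only).
space = {"log2_C": list(range(-2, 9)), "log2_gamma": list(range(-10, 1))}

grid_best = best(map(evaluate, grid_candidates(space)))
random_best = best(map(evaluate, random_candidates(space, n=20)))
```

Because `evaluate` depends only on its own parameter combination, the built-in `map` call could in principle be replaced by a distributed map phase, with `best` playing the role of the final aggregation. Random search evaluates far fewer combinations than the full grid, at the cost of possibly missing the best one.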
Related resources
Automated classification of pulmonary nodules through a retrospective analysis of conventional CT and two-phase PET images in patients undergoing biopsy
Objective(s): Positron emission tomography/computed tomography (PET/CT) examination is commonly used for the evaluation of pulmonary nodules since it provides both anatomical and functional information. However, given the dependence of this evaluation on physician’s subjective judgment, the results could be variable. The purpose of this study was to develop an automated scheme for the classific...
Adaptive Dynamic Data Placement Algorithm for Hadoop in Heterogeneous Environments
Hadoop MapReduce framework is an important distributed processing model for large-scale data intensive applications. The current Hadoop and the existing Hadoop distributed file system’s rack-aware data placement strategy in MapReduce in the homogeneous Hadoop cluster assume that each node in a cluster has the same computing capacity and the same workload is assigned to each node. Default Hadoop d...
Lung Texture Classification Using Locally-Oriented Riesz Components
We develop a texture analysis framework to assist radiologists in interpreting high-resolution computed tomography (HRCT) images of the lungs of patients affected with interstitial lung diseases (ILD). Novel texture descriptors based on the Riesz transform are proposed to analyze lung texture without any assumption on prevailing scales and orientations. A global classification accuracy of 78.3%...
Evaluation of Lung Density and Its Dosimetric Impact on Lung Cancer Radiotherapy: A Simulation Study
Background: The dosimetric parameters required in lung cancer radiation therapy are taken from a homogeneous water phantom; however, during treatment, the expected results are being affected because of its inhomogeneity. Therefore, it becomes necessary to quantify these deviations. Objective: The present study has been undertaken to find out inter- and intra- lung density variations and its dosi...
Sentiment Analysis of Social Networking Data Using Categorized Dictionary
Sentiment analysis is the process of analyzing a person’s perception or belief about a particular subject matter. However, finding correct opinion or interest from multi-facet sentiment data is a tedious task. In this paper, a method to improve the sentiment accuracy by utilizing the concept of categorized dictionary for sentiment classification and analysis is proposed. A categorized dictiona...
Journal:
- J. Imaging
Volume 2, Issue
Pages -
Publication date: 2016